Optimal Stopping Rules for Sequential Hypothesis Testing

Authors

  • Constantinos Daskalakis
  • Yasushi Kawase
Abstract

Suppose that we are given sample access to an unknown distribution p over n elements and an explicit distribution q over the same n elements. We would like to reject the null hypothesis “p = q” after seeing as few samples as possible when p ≠ q, while never rejecting the null when p = q. Well-known results show that $\Theta(\sqrt{n}/\epsilon^2)$ samples are necessary and sufficient for distinguishing whether p equals q versus p is $\epsilon$-far from q in total variation distance. However, this requires the distinguishing radius $\epsilon$ to be fixed prior to deciding how many samples to request. Our goal is instead to design sequential hypothesis testers, i.e. online algorithms that request i.i.d. samples from p and stop as soon as they can confidently reject the hypothesis p = q, without being given a lower bound on the distance between p and q when p ≠ q. In particular, we want to minimize the number of samples requested by our tests as a function of the distance between p and q, and if p = q we want the algorithm, with high probability, to never reject the null. Our work is motivated by and addresses the practical challenge of sequential A/B testing in Statistics. We show that, when n = 2, any sequential hypothesis test must see $\Omega\left(\frac{1}{d_{tv}(p,q)^2} \log\log\frac{1}{d_{tv}(p,q)}\right)$ samples, with high (constant) probability, before it rejects p = q, where $d_{tv}(p,q)$ is the total variation distance between p and q, which is unknown to the tester. We match the dependence of this lower bound on $d_{tv}(p,q)$ by proposing a sequential tester that rejects p = q from at most $O\left(\frac{\sqrt{n}}{d_{tv}(p,q)^2} \log\log\frac{1}{d_{tv}(p,q)}\right)$ samples with high (constant) probability. The $\Omega(\sqrt{n})$ dependence on the support size n is also known to be necessary. We similarly provide two-sample sequential hypothesis testers, when sample access is given to both p and q, and discuss applications to sequential A/B testing.

1998 ACM Subject Classification: F.2.2 Computations on discrete structures, G.3 Probability and Statistics
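To make the n = 2 setting concrete, here is a minimal Python sketch of a sequential tester for a coin with known null bias q. It is not the tester proposed in the paper: it is a standard doubling-plus-union-bound construction, assumed here purely for illustration. It examines the running count only at checkpoints t = 1, 2, 4, 8, ... and applies a Hoeffding bound at each one, spending an error budget of delta / ((k + 1)(k + 2)) at checkpoint k so that the budgets sum to delta. Because checkpoint k corresponds to about log2(t) checks, the threshold grows like sqrt(t log log t), which is roughly the log log dependence the abstract describes. The names `checkpoint_threshold`, `sequential_coin_test`, and the parameter `delta` are hypothetical.

```python
import math
import random
from typing import Iterable, Optional


def checkpoint_threshold(k: int, delta: float) -> float:
    """Hoeffding deviation bound used at the checkpoint t = 2**k.

    Assumption for this sketch: checkpoint k gets error budget
    delta / ((k + 1) * (k + 2)); these budgets sum to delta over k >= 0.
    """
    t = 2 ** k
    delta_k = delta / ((k + 1) * (k + 2))
    return math.sqrt(0.5 * t * math.log(2.0 / delta_k))


def sequential_coin_test(
    samples: Iterable[int], q: float, delta: float = 0.05
) -> Optional[int]:
    """Consume 0/1 samples one at a time and test the null "P(1) = q".

    Returns the number of samples seen when the null is rejected, or
    None if the stream ends without a rejection. Under the null, the
    union bound over checkpoints keeps the probability of ever
    rejecting at most delta.
    """
    ones = 0
    k = 0  # index of the next checkpoint, located at t = 2**k
    for t, x in enumerate(samples, start=1):
        ones += x
        if t == 2 ** k:
            # Reject if the count deviates from its null mean t*q
            # by more than the checkpoint's Hoeffding threshold.
            if abs(ones - t * q) > checkpoint_threshold(k, delta):
                return t
            k += 1
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical example: true bias 0.6 tested against the null q = 0.5.
    stream = (1 if rng.random() < 0.6 else 0 for _ in range(100_000))
    print("rejected after", sequential_coin_test(stream, q=0.5), "samples")
```

The tester never needs a lower bound on |p - q| in advance: a larger gap simply triggers a rejection at an earlier checkpoint. Checking only at powers of two costs at most a factor of two in samples compared with checking at every step.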


Similar resources

On Optimal Stopping Problems in Sequential Hypothesis Testing

After a brief survey of a variety of optimal stopping problems in sequential testing theory, we give a unified treatment of these problems by introducing a general class of loss functions and prior distributions. In the context of a one-parameter exponential family, this unified treatment leads to relatively simple sequential tests involving generalized likelihood ratio statistics or mixture li...


Multistage Tests of Multiple Hypotheses

Conventional multiple hypothesis tests use step-up, step-down, or closed testing methods to control the overall error rates. We will discuss marrying these methods with adaptive multistage sampling rules and stopping rules to perform efficient multiple hypothesis testing in sequential experimental designs. The result is a multistage step-down procedure that adaptively tests multiple hypotheses ...


Multisource Bayesian sequential binary hypothesis testing problem

We consider the problem of testing two simple hypotheses about unknown local characteristics of several independent Brownian motions and compound Poisson processes. All of the processes may be observed simultaneously as long as desired before a final choice between hypotheses is made. The objective is to find a decision rule that identifies the correct hypothesis and strikes the optimal balance...


Deriving Stopping Rules for the Probabilistic Hough Transform by Sequential Analysis

It is known that Hough Transform computation can be significantly accelerated by polling instead of voting. A small part of the data set is selected at random and used as input to the algorithm. The performance of these Probabilistic Hough Transforms depends on the poll size. Most Probabilistic Hough algorithms use a fixed poll size, which is far from optimal since conservative design requires th...


Sequential Hypothesis Test with Online Usage-Constrained Sensor Selection

This work investigates the sequential hypothesis testing problem with online sensor selection and sensor usage constraints. That is, in a sensor network, the fusion center sequentially acquires samples by selecting one “most informative” sensor at each time until a reliable decision can be made. In particular, the sensor selection is carried out in the online fashion since it depends on all the...




Publication date: 2017